THE APPLICATION OF CORRELATION THEORY TO
CERTAIN  PROBLEMS  OF  EDUCATION 

INTRODUCTION 

Within the past ten or fifteen years, there has been a tendency
among  those  contributing  to  psychological  and  educational  literature  to  attempt 
to  measure  abilities  of  students  and  to  study  educational  facts  by  statistical 
methods.      If  there  are  abilities  that  can  be  measured,  this  method  offers  a 
hopeful  outlook  in  regard  to  settling  certain  questions  such  as  those  of  the 
correlation  of  abilities  and  the  transfer  of  training.      There  are  various 
degrees  of  merit  to  these  statistical  studies,  with  results  and  interpretations 
varying  from  those  entirely  justifiable  to  those  that  seem  wholly  unwarranted. 
It is the purpose of this paper to report on an examination of the literature
in which mathematical statistics has been utilized, with particular regard to
the nature of the methods used.

ABSTRACTS  AND  CRITICISMS  OF  ARTICLES 

In an article on "A Study in Formal Discipline,"1 the method is
to be criticized because data are given for but one class of twenty-four
students, while it is claimed that conclusions are drawn from ten classes. The
author's conclusions can not be checked from the data given. In order to
attempt to prove his point, namely, that students able in mathematical reasoning
are not even generally able in practical reasoning and law,2 the grades were
arranged in vertical columns, and lines were drawn emphasizing cases that were
unlike and disregarding cases which were alike. His result is due to
over-emphasis of cases favorable to the author's argument.

1. Lewis, F. C. School Review, Vol. 13, pp. 281-292 (1905).

2. Loc. cit., p. 291.

The errors of this article are discussed in an article "On the
Correlation of the Marks of Students in Mathematics and in Law."1 Dr. H. L.
Rietz obtained from Dartmouth College the data which Mr. Lewis used, and from
these data he found that there exists a significant positive correlation
coefficient between the marks of students in mathematics and in law. The
result which Mr. Lewis obtained was equivalent to a negative correlation
coefficient. Hence, Mr. Lewis drew incorrect conclusions from his data. In
the article by Dr. Rietz, the data in the form of correlation tables are given
for the years 1897 (the year cited as the example in Mr. Lewis' paper), 1898,
1899, 1900, 1901, and a combined table for the years 1897-1901. The mean grade
in mathematics, the mean grade in law, the correlation coefficient to three
decimal places, and the probable error are given for each table. The data given
enable one to check the author's results. He does not draw the lines of
regression, but explains this omission as follows: "While the number of variates
used in the tables for separate years is too small to warrant the calculation
of the means of arrays, we find from the combined table, except near the
extremes, that the means of arrays lie fairly near the line of regression. We
can therefore predict fairly well from the correlation coefficient the average
grades of students in law for assigned grades in mathematics."2
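
The probable error reported with each coefficient in tables of this kind can be checked with the classical large-sample formula, P.E. = 0.6745 (1 - r^2) / sqrt(n). The following is a minimal sketch under that assumption; the function name and the sample values of r and n are hypothetical and are not taken from the tables under review.

```python
import math

def probable_error_of_r(r: float, n: int) -> float:
    """Classical probable error of a product-moment correlation coefficient.

    Assumes the large-sample formula 0.6745 * (1 - r^2) / sqrt(n).
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    if n <= 0:
        raise ValueError("n must be a positive number of variates")
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)

# Hypothetical values chosen only to illustrate the arithmetic.
r, n = 0.30, 100
pe = probable_error_of_r(r, n)
print(f"r = {r:.3f}, probable error = {pe:.4f}, ratio r / P.E. = {r / pe:.1f}")
```

A coefficient several times its probable error was conventionally treated as significant, which is why the number of variates in a single year's table matters so much.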

One  of  the  most  careful  pieces  of  work  on  the  marks  of  students 
is  an  article  on  "Correlation  of  Efficiency  in  Mathematics  and  Efficiency  in 
Other Subjects."3 The grades of some 1200 students of the University of Illinois
were  examined  to  determine  the  correlation  between  efficiency  of  students  in 
mathematics  and  their  efficiency  in  foreign  languages  and  in  natural  science. 


1. H. L. Rietz, Journal of Educational Psychology, Feb. 1916, pp. 87-93.

2. Loc. cit., p. 89.

3. H. L. Rietz and I. Shade. Illinois University Studies, Vol. III, No. 1.


The data are given in correlation tables and the lines of regression are drawn.
The correlation coefficients of 0.48 and 0.44 obtained are higher than those
obtained by Pearson for the correlation between the statures of father and son.
The correlation coefficients for mathematics-foreign languages are given when
grades at 50 per cent are included and when grades at 50 per cent are excluded.
A form of procedure for computing the correlation coefficient is given in
tabular form. The question of probable error is also considered. Mr. C. N.
Moore has said in regard to the size of the correlation coefficients obtained
by the authors of this article, "In view of the standing of one of the writers
as an expert on statistical subjects this conclusion deserves especial
consideration."1
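
When the data are published as a correlation table, any reader can recompute the coefficient from the cell frequencies. The sketch below shows one way to do so, assuming each class is represented by its mid-value; the table, the mid-values, and the function name are hypothetical and do not reproduce the procedure or figures of the article.

```python
import math

def pearson_from_table(x_mids, y_mids, freq):
    """Product-moment r from a correlation table.

    freq[i][j] is the number of students whose x grade falls in class i
    and whose y grade falls in class j; classes are represented by mid-values.
    """
    n = sum(sum(row) for row in freq)
    mean_x = sum(x * sum(row) for x, row in zip(x_mids, freq)) / n
    mean_y = sum(
        y * sum(freq[i][j] for i in range(len(x_mids)))
        for j, y in enumerate(y_mids)
    ) / n
    sxx = syy = sxy = 0.0
    for i, x in enumerate(x_mids):
        for j, y in enumerate(y_mids):
            f = freq[i][j]
            dx, dy = x - mean_x, y - mean_y
            sxx += f * dx * dx
            syy += f * dy * dy
            sxy += f * dx * dy
    return sxy / math.sqrt(sxx * syy)

# Hypothetical 3 x 3 table of grades (rows: mathematics, columns: language).
mids = [65, 75, 85]
table = [
    [4, 2, 0],
    [2, 6, 2],
    [0, 2, 4],
]
print(round(pearson_from_table(mids, mids, table), 3))
```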

In an article by Thorndike and Woodworth, "On the Influence of
Improvement in One Mental Function upon the Efficiency of Other Functions,"2
people were tested upon their ability to estimate areas. The subjects were
aided to the extent that they were allowed to ascertain the real area after
each judgment. Table I gives the average area before training, the average
area after training and, for the training series, the average error at the
end of training, stated in square centimeters. In Table II, the ratio of
error after training to error before training is given. Tables I and II give
data for six subjects only. The idea in this paper is that whether the
difference found in judging magnitudes is such as would be expected from
chance can be tested by comparing the actual differences between the average
errors with the probable differences as computed from the probability curve.
However, sufficient data are not given for this purpose. The authors give
curves for three individuals in which the ordinates represent the mean square
error of judgment of areas of ten to one hundred square centimeters. No
correlation coefficients are computed, although the authors speak of the
desirability of correlation in the beginning of the article.

1. Moore, C. N. Correlation and Disciplinary Values, School and Society,
Vol. II, p. 382.

2. Psychological Review, 1901, Vol. 8, pp. 247-251; 384-395; 553-564.

The article, "Correlation between Ability to Think and Ability
to Remember with Special Reference to United States History,"1 signed
"Educational Statistician, State Board of Education, Madison, Wis.," was read at
a meeting of the National Association of Directors of Educational Research and
so it should be a carefully prepared article. The author states, "The
percentages of correct answers were converted into units of variability, assuming
a normal distribution of ability in history for each of the highest grades of the
elementary school. The unit chosen was the probable error."2
Later he says, "The statistical assumptions need concern us very little at this
time."3 However, it does not seem justifiable, except perhaps for rough
approximations, to assume a normal distribution when there are several
probability curves, any one of which the distribution of grades might follow. The
unit chosen does not affect the form of distribution. He computes the
Pearson correlation coefficient for the relation between thought and memory in
children, but he gives no data. This makes his result lose a great deal of
its value, for there is no way of checking his results, and often conclusions
which the writer has overlooked may be drawn from the data. He also computes
the coefficient by the method of unlike signed pairs, securing a coefficient
seven-hundredths higher and with a smaller probable error. He maintains that
there are decided limitations to the use of the correlation coefficient and that
the regression coefficient much better indicates the extent to which achievement
in remembering historical facts may serve as an index of achievement in judgment
or thought about them. He gives the regression coefficients for information
ability on thought ability and for thought ability on information ability,
but gives no data or correlation tables. He then gives the regression equation
formed by using the regression coefficient. The author could have treated his
subject much more adequately by giving his data in the form of correlation
tables and drawing the lines of regression so that the reader might check the
computed results, and determine from the form of the regression curve the
interpretation to be placed on the correlation coefficient.

1. B. R. Buckingham, School and Society, Vol. V, pp. 443-449.

2. Loc. cit., p. 443.

3. Loc. cit., p. 443.
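
The method of unlike signed pairs mentioned in this review can be illustrated briefly. The sketch below assumes the usual form of the method, in which deviations are taken from the medians and the coefficient is estimated as r = cos(pi * U), where U is the proportion of pairs whose deviations have unlike signs. That formula rests on an assumption of normal correlation, and the data and function name here are hypothetical, since the article under review publishes no data.

```python
import math
from statistics import median

def r_unlike_signs(xs, ys):
    """Estimate r as cos(pi * U), U = share of pairs with unlike-signed deviations.

    Deviations are measured from the medians; pairs lying exactly on a
    median carry no sign and are left out of the count.
    """
    mx, my = median(xs), median(ys)
    like = unlike = 0
    for x, y in zip(xs, ys):
        dx, dy = x - mx, y - my
        if dx == 0 or dy == 0:
            continue
        if (dx > 0) == (dy > 0):
            like += 1
        else:
            unlike += 1
    if like + unlike == 0:
        raise ValueError("no signed pairs to count")
    return math.cos(math.pi * unlike / (like + unlike))

# Hypothetical scores for eight pupils in memory and in thought questions.
memory = [52, 60, 61, 67, 70, 74, 80, 85]
thought = [48, 66, 55, 60, 72, 63, 77, 81]
print(round(r_unlike_signs(memory, thought), 3))
```

Because the estimate uses only signs, it cannot be checked against a product-moment coefficient unless the underlying pairs are published, which is the point of the criticism above.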

In an article on "The Grading of Students,"1 Mr. Meyer criticizes
the conclusion of Professor W. S. Hall that grades of students when tabulated
yield results which conform to the binomial curve. The author thinks students
can not be graded in such a mechanical way. Then he proceeds to outline a
method for grading. He says, "It seems plausible to start from the assumption
that the combined mental and moral ability which we want to measure is
distributed among different people in accordance with the probability curve."2
Since there are several probability curves, it is perhaps an unwarranted
assumption that such complex things as grades should follow the normal
probability curve. He gives data for classes in the University of Missouri
where the students in each class are divided into three groups, 50 per cent
medium, 25 per cent superior, and 25 per cent inferior.

The article entitled "On the Significance of the Teacher's
Appreciation of General Intelligence"3 is a very thorough statistical treatment of
the subject. Seventeen tables are given in which the data in the form of
correlation tables are exhibited. The relationship of clothes and intelligence
in school children is shown graphically. By calculating the standard deviations
of the arrays and the standard deviation of the whole population, the regression
curve is found to be very closely linear. The correlations are computed for
(1) standard and age, (2) standard and intelligence, (3) standard and order in
examination, (4) standard and percentage of marks, (5) standard and clothing,
(6) age and intelligence, (7) age and order in examination, (8) age and
percentage of marks, (9) age and clothing, (10) intelligence and order in
examination, (11) intelligence and percentage of marks, (12) intelligence and
clothing, (13) order in examination and percentage of marks, (14) order in
examination and clothing, (15) percentage of marks and clothing, (16) school and
intelligence, and (17) school and clothing. The correlations (1), (3), (4),
(6), (9), (10), (11), (14), and (15) were found by the correlation ratio method;
(2), (5), (12), (16), and (17) were found by mean square contingency; (7), (8),
and (13) were found by the product-moment method. The conclusion reached is
that the teacher's judgment of general intelligence will give at least an
estimate of the examination value of his pupil, and he believed it to be of
even more importance.

1. Max Meyer, Science, N.S. XXVIII, pp. 243-250.

2. Loc. cit., p. 246.

3. Walter H. Gilby assisted by Karl Pearson, Biometrika, Vol. VIII, Parts I and II,
pp. 94-108.
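
The correlation ratio differs from the product-moment coefficient in that it measures how closely the means of arrays follow any curve, not only a straight line. The sketch below is an illustration under assumed data: the observations and function names are hypothetical, and the example is deliberately constructed so that the relation is strongly curved.

```python
import math
from collections import defaultdict

def pearson_r(pairs):
    """Product-moment correlation of a list of (x, y) observations."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    return sxy / math.sqrt(sxx * syy)

def correlation_ratio(pairs):
    """Correlation ratio (eta) of y on x.

    Each distinct x class forms an array; eta^2 is the share of the total
    variation in y accounted for by the means of those arrays.
    """
    ys = [y for _, y in pairs]
    mean_y = sum(ys) / len(ys)
    arrays = defaultdict(list)
    for x, y in pairs:
        arrays[x].append(y)
    between = sum(
        len(a) * (sum(a) / len(a) - mean_y) ** 2 for a in arrays.values()
    )
    total = sum((y - mean_y) ** 2 for y in ys)
    return math.sqrt(between / total)

# Hypothetical curved relation: y is high at both extremes of x, low in the middle.
pairs = [(1, 2)] * 5 + [(2, 1)] * 5 + [(3, 2)] * 5
print("r   =", round(pearson_r(pairs), 3))
print("eta =", round(correlation_ratio(pairs), 3))
```

When eta and r nearly agree, regression is close to linear and r can be read in the usual way; a large gap between them is a warning that r understates the relationship.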

Karl Pearson has written a paper "On the Value of the Teacher's
Opinion of the General Intelligence of School Children."1 Adequate data are
given in the form of correlation tables. He computes correlations by
correlation ratios and by multiple correlation. His conclusion is that the
teacher's estimation of general capacity does mean something and that it has a
very direct and practical value when properly registered and handled.

In the paper, "On the Correlation of Mental and Physical Tests,"2
the author states that his data were obtained from Columbia University. He
computes correlation coefficients for tests in Quickness and Accuracy, Memory,
Physical Tests, and Class Standing. There are forty-two coefficients computed
in all. For comparison purposes, he obtains a correlation of .66 for weight
and height of students. However, he gives no correlation tables or data.
There is therefore no published information that enables us to know how closely
his data would give linear regression. We shall comment further on the
desirability of knowing that we have linear regression in the interpretation of
the correlation coefficient.

1. Biometrika, Vol. VII, Part IV (Nov. 1910), pp. 542-548.

2. Wissler, Clark, Columbia Univ. Contrib. to Ed. No. 9.
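
Whether regression is linear can be judged by setting the means of arrays beside the fitted line of regression, which is only possible when the table is published. The following sketch shows the comparison on hypothetical observations; the data and function names are illustrative assumptions and do not come from the paper under review.

```python
from collections import defaultdict

def regression_line(pairs):
    """Least-squares line of regression of y on x; returns (slope, intercept)."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def array_means(pairs):
    """Mean of y within each x class (the means of arrays)."""
    arrays = defaultdict(list)
    for x, y in pairs:
        arrays[x].append(y)
    return {x: sum(ys) / len(ys) for x, ys in sorted(arrays.items())}

def max_departure(pairs):
    """Largest gap between a mean of an array and the line of regression."""
    slope, intercept = regression_line(pairs)
    return max(
        abs(mean - (slope * x + intercept))
        for x, mean in array_means(pairs).items()
    )

# Hypothetical data: one set scattered about a straight line, one curved.
linear = [(x, 2 * x + e) for x in (1, 2, 3, 4) for e in (0, 2)]
curved = [(x, x * x) for x in (1, 2, 3, 4)]
print("linear:", round(max_departure(linear), 3))
print("curved:", round(max_departure(curved), 3))
```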

"Changes in the Age of College Graduation"1 is an article giving
convincing data. The calculations are based on 20,000 cases and include the
graduates of eleven different colleges. The results are given for decades.
The author favors the use of the median age, but the arithmetic mean of ages is
also given. The ages are furthermore represented graphically in histograms. His
data show that a larger percentage have graduated under twenty-three in the later
decades than in the earlier decades.

In a paper on "Correlation between the Oral and Written Work of
Pupils in the Fundamentals of Addition,"2 Mr. Earle E. Wilson gives
absolutely no data, but he gives the mean, median, average deviation, standard
deviation, probable error, coefficient of variability, coefficient of
correlation (Pearson formula), and probable error of r for both written and
oral tests.

In a paper on "Physical vs Mental Ability,"3 the same author drew
his conclusions from the records of one hundred twenty-eight boys. He gives
a tabulated distribution of the boys, (1) athletic ability distribution and
(2) scholarship distribution, but he did not give correlation tables. He gives
the following measures and the formula used for each one for physical tests and

scholastic standing: mean, $\frac{\sum M}{n}$; median; average deviation,
$\frac{\sum |d|}{n}$; standard deviation,
$\sqrt{\frac{d_1^2 + d_2^2 + d_3^2 + \cdots + d_n^2}{n}}$; probable error,
$.6745\,\sigma$; coefficient of variability, $\frac{100\,\sigma}{M}$;
coefficient of correlation (Pearson's), $\frac{\sum xy}{n\,\sigma_x \sigma_y}$;
and probable error of correlation, $.6745\,\frac{1 - r^2}{\sqrt{n}}$.


1. Thomas, W. Scott. Pop. Sci. Mon. Vol. 63, pp. 159-171.

2. School and Society. Vol. V, p. 300.

3. Wilson, E. E. School and Society. Vol. VI, p. 30. (July 7, 1917).
The purpose of the article "Mentality of Nations"1 was to compare
the states of the Union and different countries as to education and diffusion
of knowledge, and to determine what relation, if any, intellectual conditions
may have to patho-social and other conditions in those countries. "Mentality"
is used to mean diffusion of education, knowledge, or information throughout
the population as a whole. The statistics are for the year 1908. The data
given are (1) mentality in districts of the United States, (2) mentality in
states of the United States, (3) mentality in ten countries of the world, (4)
percentage book production for each subject, (5) sociological conditions for
ten countries including births, deaths, number still born, marriages, emigrants;
(6) patho-social conditions (crime in general, murder or suicide, theft, all
offenses and crimes, number of insane in institutions, number of paupers, number
of suicides, number of illegitimate births, number of divorces). In this
paper, the conclusions are drawn by mere inspection of the data.

In a critical paper by Charles N. Moore, "On Correlation and
Disciplinary Values,"2 he states that the scientific basis of the
anti-disciplinarians seems on careful examination to be largely imaginary. He
discusses the interpretation to be given to the correlation coefficient. In
discussing the interpretation of fractional values, he gives examples of
Pearson's correlation of 0.51 for correlation in stature in collateral
inheritance, Wissler's correlation coefficients from 0.51 to 0.75, with the
exception of French and rhetoric which was 0.30; Spearman's correlations for
four subjects, Classics, French, English, and Mathematics from 0.64 to 0.83;
and the correlation coefficients of 0.48 and 0.44 obtained by Rietz and Shade
which he especially commends. However, the anti-disciplinarians frequently
refer to correlation coefficients in the range 0.4 to 0.5 as corresponding to a
low degree of relationship between the variables involved. He gives four
correlation tables for algebra-algebra (boys), algebra-algebra (girls),
geometry-English (boys), and geometry-English (girls). He gives the correlation
coefficients and their probable errors obtained from the above mentioned data.
He does not draw the lines of regression.

1. MacDonald, Arthur. Open Court. Vol. 26, pp. 449-460.

2. Moore, C. N. School and Society. Vol. II, pp. 378-385.

In an article on "A Comparison of Elementary and High School
Grades,"1 the data were from the Iowa City School records. Only such records
were included as show the completion of the last four years of the elementary
school and the first two years of high school. The author found the correlation
between the average elementary school grade and the average high school grade
to be 0.71 by the Pearson formula. From this fact he concluded that those best
in the elementary school are best in high school. He found that the
correlations between specific subjects were also quite high, although they were
usually less than between the general averages. His correlation coefficients
between the same subjects in different schools, say between grade history and
high school history, were not markedly higher and many were not so high as
those between different subjects either in the same school or in different
schools. He did not give correlation tables or data.

Mr. W. T. Foster in "Should Students Study"2 tabulates the
percentage of those attaining distinction in later life with the number who
received first, second, third, fourth, and pass-degree honors. He draws
conclusions from mere inspection of the data. He says, "A similar correlation
is found between the degree of success of undergraduates at Oxford and their
subsequent distinction as clergymen."3 He does not use the term correlation in
the sense of mathematical correlation.

Caroline Burke's article on "The Collecting Instinct"4 gives data
for the number of collections different children have acquired. The average
number of collections to a child is computed. Conclusions are drawn by mere
inspection of the data.

1. Miles, W. R. Pedagogical Seminary. Dec. 1910. Vol. XVII, pp. 439-450.

2. Pop. Sci. Mon. Vol. 133, pp. 509-613.

3. Loc. cit., p. 617.

4. Ped. Sem. Vol. 7, pp. 179-207.

In the article, "Early Interests: Their Permanence and Relation
to Abilities,"1 it is stated that the data are from three hundred forty students,
but no data are given. Coefficients of correlation are given for correlation
of elementary interest with high school interest, r = 0.85; of elementary
interest with college interest, r = 0.66; and of high school interest with
college interest, r = 0.79. The article lacks weight because no data are given
with which to check the conclusions.

Irving King and Morris Adelstein have also written on "The
Permanence of Interests and their Relation to Abilities."2 Their data are from
one hundred forty University of Iowa students, but the data are not exhibited.
Coefficients of correlation are computed. The number of students involved
is rather too small to make the results conclusive.

In a paper on "Correlation between Reading Tests and General
Ability,"3 the author says, "The value of a test depends upon the accuracy with
which it measures some specific ability and the extent to which it gives clearly
uniform results under similar conditions."4 In education the experimenter
is never quite sure of identical conditions. Dr. King of the State University
of Iowa set out to find whether there was correlation between the Kansas
silent-reading tests and other subjects. Median grades by sexes are given for
the ninth, tenth, eleventh and twelfth grades of the Iowa City High School. The
number of students tested is not given. The results are given in the form:
students averaging E in school work made a median grade of 42.0 in the reading
test, students averaging G made a median grade of 41.3, those averaging M made
a median grade of 25.0, those averaging P made a median grade of 21.3, those
averaging F made a median grade of 14.0. The coefficients of coordination
(Spearman Foot-rule formula) for freshmen engineers between the Kansas
silent-reading tests and hard-opposites tests are found to be R = 0.18; between
Kansas silent-reading tests and scholastic rank, R = 0.12; and between
hard-opposites tests and scholastic rank, R = 0.42. Although no data or
correlation tables are given, the author says, "It will be seen from the above
that the results of the hard-opposites tests are a much better measure of the
sort of ability that is expressed in class ranks than is the Kansas
silent-reading test."1 This conclusion should be supported by data in order to
be convincing. Other coefficients of correlation are given for Junior and
Senior liberal arts students, but no data are given for them. The author states
that these Spearman coefficients of coordination may be converted into
approximate coefficients of correlation by multiplying each by the factor 1.5.
This rule is likely to give only a rough approximation to the value of the
correlation coefficient.

1. Thorndike, E. L.: School and Society. Vol. V, pp. 178-179.

2. King, Irving and Adelstein, Morris. School and Society. Vol. VI, pp.359~35C

3. School Review: Editorial comment. Vol. XXV, pp. 57-59.

4. Loc. cit., p. 57.
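
How rough the factor 1.5 is can be seen by setting it beside a trigonometric conversion. The sketch below assumes the foot-rule in the form R = 1 - 6 * (sum of gains in rank) / (n^2 - 1) and the cosine conversion r = 2 cos(pi (1 - R) / 3) - 1 that is usually attributed to Pearson. Both forms are assumptions brought in for illustration, since the article under review states only the factor 1.5, and the function names are hypothetical.

```python
import math

def spearman_footrule(rank_x, rank_y):
    """Spearman's foot-rule R from two rankings of the same individuals.

    Assumed form: R = 1 - 6 * (sum of positive rank differences) / (n^2 - 1).
    """
    n = len(rank_x)
    if n != len(rank_y) or n < 2:
        raise ValueError("need two rankings of equal length, n >= 2")
    gains = sum(max(0, a - b) for a, b in zip(rank_x, rank_y))
    return 1.0 - 6.0 * gains / (n * n - 1)

def footrule_to_r(R):
    """Cosine conversion of R to an approximate product-moment r.

    Assumed formula: r = 2 * cos(pi * (1 - R) / 3) - 1, intended for
    non-negative R and resting on an assumption of normal correlation.
    """
    return 2.0 * math.cos(math.pi * (1.0 - R) / 3.0) - 1.0

# The three foot-rule coefficients quoted in the review.
for R in (0.18, 0.12, 0.42):
    print(f"R = {R:.2f}   1.5 R = {1.5 * R:.3f}   cosine conversion = {footrule_to_r(R):.3f}")
```

The two conversions agree at R = 0 but part company elsewhere (the factor 1.5 even exceeds unity for R above two-thirds), so a single multiplier can only be a rough guide.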

In the article, "The Relative Proficiency of University Students
in an Elementary Course in Zoology,"2 the author begins by stating that he has
data on the grades of six hundred fifty students who have taken the elementary
courses in zoology at the University of Illinois. He expresses the results
obtained as curves in which the grades are used as abscissae, and the percentages
of students in the various years of work are given as ordinates. None of the
data is given with which to check up the correctness of the curves. This gives
the reader no opportunity to draw his own conclusions from the author's data.
The fact is mentioned that many educators agree that the distribution of grades
should conform to the normal distribution curve, but the author does not believe
this assumption is justifiable in grading. None of the curves he obtained
follows the normal distribution curve.

1. Loc. cit., p. 59.

2. Van Cleave, H. J. School and Society. Vol. V, pp. 356-360.

In the book, "The Modern High School,"1 there is given a table
on "Correlation of Subjects Taught with the High School Teachers' Specific
Preparation for Teaching in Twenty Towns of "A" file."2 It is as follows:


            Correlation    No Correlation    Totals

I                48              52            100
II                1              15             16
III               1               1              2

Totals           50              68            118

It  is  not  clear  what  the  author  means  by  "correlation"  and  "no  correlation" 
here. They evidently are not correlation coefficients. The value of this work
would be increased by giving an explanation of the sense in which the term
correlation is used.


The article on "The Relationships between the Abilities Involved
in the Study of the Grammar School Subjects,"3 contains the statement "that
the ratings of the individuals of a class follow the normal distribution curves
fairly well"4 and further states, "This is good evidence of the validity
of the ratings."4 This is open to criticism because Pearson has developed
twelve types of probability curves, and distributions of such complex things
as grades are perhaps as likely to follow some other types as to follow the
normal probability curve. It is therefore a highly questionable method that
bases the validity of the ratings on the fact that the grades follow the normal
distribution curve.

1. Johnston, C. H. and others: The Modern High School.

2. Loc. cit., p. 146.

3. Smith, A. G. Columbia Univ. Contrib. to Ed. Vol. 11, No. 2.

4. Loc. cit., p. 11.

In fact, the distributions should be examined critically
on this point. In ascertaining the answers to questions like the following:
"How far does ability in English imply ability in geography?"1 he used the
Pearson correlation coefficient as the best measure of the relationship. In
explaining the interpretation of the correlation coefficient, he makes the
following statement: "Thus, 0 correlation between English and arithmetic would
mean that E's (excellent), G's (good), P's (passable), U's (unsatisfactory), and
B's (bad) in English would all do equally well in arithmetic."2 He gives the
correlation coefficients found between different subjects for boys, girls, and
the average between them. In his "note" the author says that E, G, P, U, and
B are assigned the positive and negative values of x and y in terms of the
"probable error as a unit, which they would have if the abilities in question
were distributed according to the normal frequency curve."3 However, the form
of the curve is not affected by the unit chosen. The author also says, "The
details of the calculation can readily be surmised by those acquainted with
statistics, and would not interest others."4 Now those familiar with statistics
would have more faith in the coefficients computed if the data were given with
which to check the results and if the lines of regression in the correlation
tables were drawn, and those unfamiliar with statistics could gather at least
some information from a correlation table were it given. The false assumption
is again used in testifying in regard to the grades used: "As pertinent evidence
to their validity, it will be noticed that here likewise the ratings follow
roughly a normal distribution scheme."5 This hardly constitutes evidence
that they were a satisfactory set of grades.
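
Two points made above can be checked numerically: whether a set of grades is near the normal form, and whether a change of unit alters that form. The sketch below computes the moment criteria beta_1 = m3^2 / m2^3 and beta_2 = m4 / m2^2, which are 0 and 3 for the normal curve, before and after rescaling into probable-error units. The grades are hypothetical and the function name is an illustrative assumption.

```python
import math

def beta_criteria(values):
    """Return (beta_1, beta_2) computed from the central moments of the data."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 * m3 / m2 ** 3, m4 / (m2 * m2)

# Hypothetical grades on a five-point scale.
grades = [1, 2, 2, 3, 3, 3, 4, 4, 5]

# Re-express each grade as a deviation from the mean in probable-error units.
mean = sum(grades) / len(grades)
sigma = math.sqrt(sum((g - mean) ** 2 for g in grades) / len(grades))
probable_error = 0.6745 * sigma
rescaled = [(g - mean) / probable_error for g in grades]

print("original units:      ", beta_criteria(grades))
print("probable-error units:", beta_criteria(rescaled))
```

Both criteria are ratios of moments, so they are unchanged by the choice of unit; expressing grades in probable-error units therefore cannot make a non-normal distribution normal.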

1. Loc. cit., p. 4.

2. Loc. cit., p. 12.

3. Loc. cit., p. 13.

4. Loc. cit., p. 13. (Note)

5. Loc. cit., p. 13.


In "The Correlations of the Abilities Involved in Secondary School
Work,"1 the author compares the school marks of children of the same parentage
to find a measure of heredity as a factor in education. He computes correlation
coefficients for different subjects between pairs of children of the same parents.
He states that his data involved different systems of grading, but the marks
could be made commensurate if the three hypotheses are accepted, namely (1) that
the marks give the relative positions of the pupils within the group, (2) that
the abilities in the school subjects follow the normal type of distribution, and
(3) that high school students represent a random picking from the total group
of boys and girls. His data are not given. He further expresses the idea
that the second hypothesis is almost surely true. This does not seem to be a
justifiable contention.

The data for the article, "The Relationships between the Abilities
Involved in Secondary School Subjects,"2 were obtained from the marks of students
in the New York Regents' Examination. In computing correlation coefficients,
only the upper fifty per cent of the grades were used in order to eliminate the
inaccuracy due to the fact that failures in the examination were not always
recorded. The grades of boys and girls were treated together. The authors express
the thought that the mixture does not produce any considerable amount of spurious
correlation because the differences between the sexes in the degree and variability
of ability are slight. The article lacks data. Furthermore, the influence of
selecting only the upper half of the grades would have a marked effect on the
correlation coefficient.3
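
The effect of such selection can be illustrated by a small simulation. The sketch below is not taken from the article under review. It assumes two normally distributed marks with a correlation of 0.60 (an arbitrary value), and it assumes that "the upper fifty per cent" means the upper half on the first mark. The names `pearson_r` and `simulate_marks` are hypothetical.

```python
import math
import random


def pearson_r(xs, ys):
    """Product-moment correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_marks(n, rho, seed=0):
    """Draw n pairs of standardized marks with population correlation rho."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = rho * x + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        pairs.append((x, y))
    return pairs


pairs = simulate_marks(20000, 0.60)  # assumed sample size and correlation
xs = [p[0] for p in pairs]
ys = [p[1] for p in pairs]
r_all = pearson_r(xs, ys)

# Keep only the pairs whose first mark lies in the upper half (assumed rule).
cutoff = sorted(xs)[len(xs) // 2]
upper = [(x, y) for x, y in pairs if x >= cutoff]
r_upper = pearson_r([p[0] for p in upper], [p[1] for p in upper])

print(f"all pairs:  r = {r_all:.2f}")
print(f"upper half: r = {r_upper:.2f}")
```

Under these assumptions the coefficient computed from the selected group falls well below that of the whole group, which is the kind of effect referred to above.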

"The Inheritance of the Ability to Learn to Spell"4 is a more
carefully prepared article. The data are given, which is very commendable
inasmuch as so many writers omit them. The Pearson coefficient of correlation for

1. Burris, W. P. Columbia Univ. Contrib. to Ed. Vol. XI, No. 2.

2. Brinkerhoff, E. C., Morris, G., Thorndike, E. L. Columbia Univ. Contrib. to
Ed. Vol. XI, No. 2.

3. Pearson, Karl: The Influence of Selection on Correlation and Variation. Phil.

4. Earle, E. L. Columbia Univ. Contrib. to Ed. Vol. XI, No. 2.


brother-brother, sister-sister, and brother-sister relationships is computed
for School A and School B. The author states that the distribution only roughly
approximates the normal distribution. The lines of regression, which are
important in interpreting the correlation coefficient, are not drawn.

The  purpose  of  the  paper,  "Correlation  of  some  Psychological  and 
Educational  Measurements,  with  special  attention  to  the  Measurement  of  Mental 
Ability,"1 was to discover the intercorrelations of some recently-developed
educational  and  vocational  tests  and  certain  psycho-physical  tests.      He  found 
that  the  Cancellation  Tests  used  correlated  negatively  with  all  those  tests  which 
proved  to  be  good  measures  of  mental  ability.      A  "Table  A"  is  exhibited  in  which 
Average  Raw  Pearson  coefficients  and  Corrected  Pearson  coefficients  are  given, 
but  no  data  are  shown. 

In his paper, "The Spelling Ability of University Students,"2
the author gives data for two classes. In finding the relation between general
scholarship and spelling ability, he gives a table in which the number of cases
for scholarship, 90% or above, 80% to 89%, 79% or below, and the average number
of words misspelled in daily work and in examination are given. By observation
of the table, he concludes there is a substantial correlation. A table is also
given in which the type of error is expressed in percentages.

In the article, "The Relation of Point-Scale Measurements of
Intelligence to Educational Performance in College Students,"3 the mean, the
per cent of maximum variation, mean variation, and per cent of mean variation
for men and women in the statistical results of Point-Scale Examinations are
given. The data are not arranged in correlation tables. However, the authors
compute correlation coefficients to draw some of their conclusions, and, in one
1. McCall, W. A. School and Society. Vol. V, pp. 24-30.

2. Brandenburg, G. C. School and Society. Vol. VII, pp. 26-29.

3. Yerkes, Robert M., Burtt, Harold E.: School and Society. Vol. V, pp. 535-540.
class, conclude by observation that the correlation is positive. Other results
are expressed in percentages in tertiles, and in percentages in quartiles.

In the paper, "Standing of Undergraduates and Alumni," previous
articles comparing the success of alumni with their scholarship in college are
discussed. The author uses histograms to show his data, in which the successive
tenths of the class are laid off on the abscissa axis and the percentage of the
total number of successful alumni in each tenth of the class is shown on the
ordinate axis.

The data in the article, "The Relative Standing in College of
Graduates Entering Various Professions,"2 are arranged in tabular form. The
college grades are arranged in tertiles, then the per cent in the lowest, middle,
and highest tertile is given opposite the ten occupations listed for English,
philosophical studies, science (including mathematics, chemistry, physics,
engineering, astronomy), social science (including economics, history, government),
and foreign languages. Conclusions are drawn from observation of the data without
any formal statistical treatment.

The article on "The Permanence of Interests and their Relations
to Abilities"3 is interesting reading, but the statistical treatment of the
subject is not convincing. One hundred individuals were asked to judge themselves
concerning the order of both their interests and abilities in mathematics,
history, literature, science, music, drawing, and other handwork (defined as
carpentering, sewing, gardening, cooking, carving, etc.) at three periods, namely
during the last three years of the elementary school period, the high school period,
and the college period. These were to be ranked in order one to seven. These
numbers are subtracted to find the difference between elementary interest rank

1. Yerkes, Robert M., Burtt, Harold E.: School and Society. Vol. V, pp. 535-540.

2. Paull, Charles H. School and Society. Vol. V, pp. 628-630.

3. Thorndike, E. L. Pop. Sci. Mon. Vol. 51, pp. 449-456.
and high school interest rank. The permanence of interest is then determined
from the sum of these seven differences as compared to the sum of the differences
if there had been no change (sum, zero), the sum if there had been a maximum
change (sum, twenty-four), and the sum if the relative strength of their interests
had changed at random (sum, sixteen). In regard to the number 16, a number
which represents rank has a pretty large fluctuation in sampling. Indeed it is
hard to estimate what this amounts to. The author states, "For the permanence
from the elementary-school period to the junior year of college or professional
school in my hundred individuals this figure is, on the average 9, three-fifths
of the individuals showing sums of from 6 to 12 for column 2 of Table 3."1
Table 3,2 which is for one individual, is as follows:


                   Difference Between    Difference Between    Difference Between
                   Elementary Interest   Elementary Interest   High School Interest
                   Rank and High School  Rank and College      Rank and College
                   Interest Rank         Interest Rank         Interest Rank

Mathematics                 0                     1                     1
History                     3                     0                     3
Literature                  0                     0                     0
Science                     3                     1                     2
Music                       2                     0                     2
Drawing                     1                     0                     1
Other hand-work             1                     0                     1

Sum                        10                     2                    10
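
The three reference sums quoted for seven ranked subjects (zero, twenty-four, and sixteen) can be checked by direct enumeration. The following is a minimal sketch, assuming that "change at random" means that every reordering of the seven ranks is equally likely. The function name `rank_difference_sums` is hypothetical, and the dictionary simply restates Table 3.

```python
from fractions import Fraction
from itertools import permutations


def rank_difference_sums(n):
    """Sum of absolute rank differences for every reordering of n ranks."""
    base = range(1, n + 1)
    return [
        sum(abs(a - b) for a, b in zip(base, perm))
        for perm in permutations(base)
    ]


sums = rank_difference_sums(7)
smallest = min(sums)                              # no change at all
largest = max(sums)                               # complete reversal of the order
chance_mean = Fraction(sum(sums), len(sums))      # all reorderings equally likely

print(smallest, largest, chance_mean)

# Rows of Table 3 for the one individual shown.
table_3 = {
    "Mathematics": (0, 1, 1),
    "History": (3, 0, 3),
    "Literature": (0, 0, 0),
    "Science": (3, 1, 2),
    "Music": (2, 0, 2),
    "Drawing": (1, 0, 1),
    "Other hand-work": (1, 0, 1),
}
column_sums = [sum(row[i] for row in table_3.values()) for i in range(3)]
print(column_sums)
```

The enumeration agrees with the figures quoted in the text, and the column sums agree with the last row of the table. It says nothing about how widely a single individual's sum may fluctuate about sixteen, which is the point of the criticism.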

1.  Loc.  cit.,  pp. 452-453. 

2.  Loc.  cit.,    p. 452. 
The figure 9 seems to be a very rough approximation. The author then expresses
the thought that this average result of 9 can be expressed as a coefficient of
correlation equivalent to over .60 and interprets it as meaning six-tenths of
perfect resemblance. It is not at all clear how the author passes over from
an average result of 9 to a correlation equal to .60. Indeed there are so
many pitfalls in correlation theory that it is very doubtful whether this
statement can be accepted as correct. At least good statistical practice
would demand that the method be shown by which the value .60 can be legitimately
obtained from the data. Furthermore, no data for the one hundred individuals
are given and we do not know whether the regression was linear. Then, the fact
that r is .60 does not mean that it shows six-tenths of perfect resemblance.
The author also states, "A sum of differences of 3 means a resemblance greater
than half of perfect resemblance, as the reader expert in the mathematics of
probability will realize."1 That treatment will not stand critical examination.
The sums 12, 10, 8, and 6 are converted into mean coefficients of resemblance
or correlation of +.33, +.55, +.71, and +.83 respectively. He finds
the correlation between an individual's order of subjects for interest and his
order for ability to be .91. Then he reaches the conclusion that a person's
relative interests are an extraordinarily accurate symptom of his relative
capacities. This conclusion does not have much weight on the basis of his treatment.
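
One way to see that r = .60 is not "six-tenths of perfect resemblance" is to use the relation derived later in this paper, by which the standard deviation of an array is the standard deviation of y multiplied by the square root of (1 - r²). The sketch below assumes linear regression. The function name `residual_sd_ratio` is hypothetical, and the three values of r are taken from the figures quoted above.

```python
import math


def residual_sd_ratio(r):
    """Ratio of the standard deviation of an array to sigma_y, assuming linear regression."""
    return math.sqrt(1.0 - r * r)


for r in (0.33, 0.60, 0.91):
    ratio = residual_sd_ratio(r)
    print(f"r = {r:.2f}: array s.d. = {ratio:.2f} of sigma_y")
```

With r = .60 the scatter about the line of regression is still four-fifths of the whole scatter of y, so prediction is improved by one-fifth and not by six-tenths.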

In the article, "Examinations, Grades, and Credits,"2 the author
says, "In so far as students are graded on the lines of the probability curve,
this may measure the attitude of the examiner rather than the distribution of
the men in merit."3 Curves are exhibited showing (1) the distribution of


1.  Loc.  cit.,  p. 453. 

2.  Cattell,  J.  Pop.  Sci.  Mon.      Vol.  66,  pp. 367-378. 

3.  Loc.  cit.,  p. 372. 
the stature of women in inches (data from Karl Pearson), (2) the theoretical
distribution of grades, (3) the most convenient distribution in practice, (4)
the distribution of the average grades assigned in five courses, and (5) the
distribution of grades of the College Entrance Examination Board. Both the
histogram and smooth curve for each are drawn. For the curve showing the
distribution of the average grades assigned in five courses, the data are
given for two hundred students expressed in the percentage of students receiving
A, B, C, D, and F for English A, English B, Mathematics A, History A, and
Economics A. Aside from this, he gives no data for the curves. The conclusions
are based upon observation of the data and curves.

The purpose of the article, "Spelling Ability - Its Measurement
and Distribution,"1 is stated to be to derive a scale for the measurement of
spelling ability and to show some of its uses and applications. Complete data
for the first and second preferred lists of words are given. In computing
correlation between grades, the author used Spearman's "'Foot-rule' for
Measuring Correlation" and then converted the values obtained into Pearson
coefficients of correlation by using the table in G. M. Whipple's "Manual
of Mental and Physical Tests." Then an attempt is made to justify the use
of the Spearman coefficient by computing the correlations for the same data
by the 'product moment' method ($r = \frac{\sum xy}{n \sigma_1 \sigma_2}$) and by the
unlike signs method ($r = \cos \pi U$). The average of the three coefficients for
each grade is then found, and the average of these averages is computed. In
comparing the average of these averages with the average value of the coefficients
found by each of the three methods, the author found that the 'average of the
average' was nearer the value of the average of the Spearman coefficients. The
argument in favor of the Spearman coefficient does not seem justifiable because
spurious correlation is likely to enter in when taking so many averages. A good
many correlation
1.    Buckingham,  B.  R.      Columbia  Univ.      Contrib.  to  Ed.  No.  59. 
coefficients are computed for different schools and grades before the conclusion
is drawn that the preferred lists were well selected. Histograms are drawn to
show the frequency of each rating in the different grades.
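
The three methods named above give different numbers for the same data, which is why averaging them proves little. The sketch below is illustrative only and rests on several assumptions: the foot-rule is taken in its commonly stated form R = 1 - 6Σg/(n² - 1), where Σg is the sum of the gains in rank; the unlike signs method is taken as r = cos(πU), where U is the proportion of pairs whose deviations from the mean have unlike signs; the marks are invented; and ties are not handled. No conversion table is applied.

```python
import math


def ranks(values):
    """Rank 1 for the largest value; assumes no ties, for simplicity."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def product_moment(xs, ys):
    """r = sum(xy) / (n * sigma_1 * sigma_2), with x and y as deviations from the mean."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def foot_rule(xs, ys):
    """R = 1 - 6 * (sum of gains in rank) / (n^2 - 1), assumed form of the foot-rule."""
    n = len(xs)
    gains = sum(max(b - a, 0) for a, b in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * gains / (n * n - 1)


def unlike_signs(xs, ys):
    """r = cos(pi * U), U being the proportion of pairs with unlike-signed deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    products = [(x - mx) * (y - my) for x, y in zip(xs, ys)]
    counted = [p for p in products if p != 0]  # pairs with a zero deviation are dropped
    u = sum(1 for p in counted if p < 0) / len(counted)
    return math.cos(math.pi * u)


# Invented marks for six pupils in two tests.
first = [62, 70, 75, 81, 88, 93]
second = [65, 68, 80, 77, 85, 95]

print(f"product moment: {product_moment(first, second):.3f}")
print(f"foot-rule R:    {foot_rule(first, second):.3f}")
print(f"unlike signs:   {unlike_signs(first, second):.3f}")
```

All three agree only when the two orders are identical. For data such as these they differ considerably, so their average has no clear meaning as a standard of comparison.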

Thus far in this paper, there are offered criticisms of statistical
methods that have been used in the study of educational problems. We propose
next to explain briefly the theory of what seem to be the most satisfactory
methods of studying correlation as it is involved in educational problems, and
to illustrate by some applications to grades of students.

THE  CORRELATION  COEFFICIENT 

Two  associated  classes  of  variables  are  said  to  be  correlated  when, 
if  values  are  assigned  to  one  variable,  the  values  of  the  other  variable  are  such 
that  their  mean  values  are  functions  of  the  assigned  values.      Our  problem  is  to 
determine  with  what  accuracy  we  can  predict  mean  values  of  the  associated  variable 
y  from  assigned  values  of  a  variable  x.       In  other  words,  do  high  values  of 
one  variable  tend  to  go  with  high  values  or  with  low  values  of  the  other  variable, 
or  is  there  no  such  tendency? 

The first important point in correlation theory is the determination
of the function which expresses the relation between x and y, when x and y
are associated classes of variables. Let $y = \theta(x)$ be the function which gives
the mean value of y corresponding to a selected x. In this equation, if $\theta(x)$
is not zero for all values of x, there is said to be correlation between x and
y. Now suppose we have the following system of associated values: $(x_1, y_1)$,
$(x_2, y_2)$, $(x_3, y_3)$, ..., $(x_n, y_n)$, which are actual measurements.
These data should be arranged in a double entry table called a correlation table,
in which the values of x and y are arranged along the horizontal and vertical
axes respectively. In this table a vertical column under x is known as
an x-array of y's. If correlation exists, it has been found that
the points which are the mean values of the x-arrays do not lie at random over
the field, but arrange themselves more or less in the form of a smooth curve
called the "curve of regression" of y on x. It has been found that in a large
number of cases this curve is approximately a straight line. When the means of
the x-arrays lie exactly on the line, the regression is said to be truly linear.
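
The means of the x-arrays can be read off a correlation table mechanically. The following is a minimal sketch with an invented table of frequencies. The function name `array_means` and the grouping of grades at 70, 80, and 90 are assumptions made only for illustration.

```python
def array_means(table, y_values):
    """Mean of y in each x-array of a correlation table.

    `table` maps each x value to a list of frequencies, one for each
    entry of `y_values`.
    """
    means = {}
    for x, freqs in table.items():
        count = sum(freqs)
        if count:  # skip empty arrays
            means[x] = sum(f * y for f, y in zip(freqs, y_values)) / count
    return means


# Invented correlation table: frequencies of y = 70, 80, 90 under each x.
y_values = [70, 80, 90]
table = {
    70: [4, 2, 0],
    80: [2, 6, 2],
    90: [0, 2, 4],
}

means = array_means(table, y_values)
for x in sorted(means):
    print(f"x = {x}: mean of array = {means[x]:.2f}")
```

In this invented table the three array means rise by equal steps as x increases, so the points lie on a straight line and the regression of y on x is linear.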

We shall consider the case where the curve is approximately a
straight line. Hence $y = \theta(x) = mx + b$. Since we wish to determine m and b
so that the y's calculated from the equation will deviate as little as possible
from the mean of the observed values, it is necessary that we adopt some criterion
as to least deviation. As it is more convenient to deal with deviations from
mean values, let $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ be deviations from
the mean, and let the co-ordinate axes be taken through the mean of x and y.
If we adopt the least squares criterion, namely that the sum of the squares of
deviations from the mean shall be a minimum, the summation

$$\sum_t n_t (\bar{y}_t - m x_t - b)^2$$

must be a minimum, where $n_t$ is the number of observations in the array under
$x_t$ and $\bar{y}_t$ is the mean of that array. Differentiating this function
partially with respect to m and b and equating the results to zero we obtain

$$-2 \sum_t n_t x_t (\bar{y}_t - m x_t - b) = 0 \qquad (1)$$

$$-2 \sum_t n_t (\bar{y}_t - m x_t - b) = 0 \qquad (2)$$

Since the origin is taken at the mean, $\sum_t n_t \bar{y}_t = 0$ and
$\sum_t n_t x_t = 0$, and therefore b = 0 by equation (2). Equation (1) now becomes

$$\sum_t n_t \bar{y}_t x_t - m \sum_t n_t x_t^2 = 0 \qquad (3)$$

Since x and y are taken as deviations from the mean, equation (3) may be expressed

$$\sum_{d=1}^{n} x_d y_d - m \sum_{d=1}^{n} x_d^2 = 0 \qquad (4)$$

If $\sigma_x$ and $\sigma_y$ represent the standard deviations of the x system and
y system of variates respectively,

$$\sigma_x^2 = \frac{1}{n} \sum_{d=1}^{n} x_d^2, \qquad \sigma_y^2 = \frac{1}{n} \sum_{d=1}^{n} y_d^2 .$$

Substituting these values in equation (4), we find

$$m = \frac{\sum_{d=1}^{n} x_d y_d}{\sum_{d=1}^{n} x_d^2} = \frac{\sum_{d=1}^{n} x_d y_d}{n \sigma_x^2} .$$

Now we may write the equation of the line of regression as

$$y = \frac{\sum x_d y_d}{n \sigma_x^2}\, x = \frac{\sum x_d y_d}{n \sigma_x \sigma_y} \cdot \frac{\sigma_y}{\sigma_x}\, x, \quad \text{or} \quad y = r \frac{\sigma_y}{\sigma_x}\, x,$$

in which the correlation coefficient

$$r = \frac{\sum_{d=1}^{n} x_d y_d}{n \sigma_x \sigma_y} .$$

In a similar manner, we find the equation of the line of regression of x on y
to be $x = r \frac{\sigma_x}{\sigma_y}\, y$.
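
The computation just described can be carried out directly from a list of paired grades. The sketch below follows the formulas above, with x and y reduced to deviations from their means. The grades are invented and the function name `regression_through_means` is hypothetical.

```python
import math


def regression_through_means(xs, ys):
    """Return r and the slopes of the two lines of regression.

    Slopes refer to axes taken through the means, so both intercepts are zero.
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    xd = [x - mean_x for x in xs]          # deviations of x from its mean
    yd = [y - mean_y for y in ys]          # deviations of y from its mean
    sum_xy = sum(a * b for a, b in zip(xd, yd))
    sigma_x = math.sqrt(sum(a * a for a in xd) / n)
    sigma_y = math.sqrt(sum(b * b for b in yd) / n)
    r = sum_xy / (n * sigma_x * sigma_y)
    slope_y_on_x = r * sigma_y / sigma_x   # y = r (sigma_y / sigma_x) x
    slope_x_on_y = r * sigma_x / sigma_y   # x = r (sigma_x / sigma_y) y
    return r, slope_y_on_x, slope_x_on_y


# Invented grades of five students in two subjects.
grades_x = [70, 75, 80, 85, 90]
grades_y = [72, 74, 83, 82, 89]

r, slope_y_on_x, slope_x_on_y = regression_through_means(grades_x, grades_y)
print(f"r = {r:.3f}")
print(f"regression of y on x: y = {slope_y_on_x:.3f} x")
print(f"regression of x on y: x = {slope_x_on_y:.3f} y")
```

The product of the two slopes equals r², which gives a convenient check on the arithmetic.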


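The square of the standard deviation of an array is given by the equation

$$\frac{\sum \left( y_d - r \frac{\sigma_y}{\sigma_x} x_d \right)^2}{n} = \frac{\sum y_d^2}{n} - 2 r \frac{\sigma_y}{\sigma_x} \cdot \frac{\sum x_d y_d}{n} + r^2 \frac{\sigma_y^2}{\sigma_x^2} \cdot \frac{\sum x_d^2}{n}$$

$$= \sigma_y^2 - 2 r^2 \sigma_y^2 + r^2 \sigma_y^2 = \sigma_y^2 (1 - r^2) .$$

Since $\frac{1}{n} \sum \left( y_d - r \frac{\sigma_y}{\sigma_x} x_d \right)^2$ is never
negative, the value $(1 - r^2)$ can not be negative. Hence r is such that

$$-1 \le r \le +1 .$$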
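
The identity above can be verified numerically. The following sketch uses invented grades. It computes the mean squared deviation of y from the line of regression directly and compares it with $\sigma_y^2 (1 - r^2)$. All variable names are hypothetical.

```python
import math

# Invented grades of five students in two subjects.
grades_x = [70, 75, 80, 85, 90]
grades_y = [72, 74, 83, 82, 89]

n = len(grades_x)
mean_x = sum(grades_x) / n
mean_y = sum(grades_y) / n
xd = [x - mean_x for x in grades_x]   # deviations from the mean of x
yd = [y - mean_y for y in grades_y]   # deviations from the mean of y

sigma_x = math.sqrt(sum(a * a for a in xd) / n)
sigma_y = math.sqrt(sum(b * b for b in yd) / n)
r = sum(a * b for a, b in zip(xd, yd)) / (n * sigma_x * sigma_y)

# Left-hand side: mean squared deviation from the line y = r (sigma_y / sigma_x) x.
mean_sq_residual = sum(
    (b - r * sigma_y / sigma_x * a) ** 2 for a, b in zip(xd, yd)
) / n

# Right-hand side of the identity.
predicted = sigma_y ** 2 * (1 - r ** 2)

print(f"direct computation:   {mean_sq_residual:.4f}")
print(f"sigma_y^2 (1 - r^2):  {predicted:.4f}")
```

The two figures agree, apart from rounding, for any set of paired values, since the identity is algebraic and does not depend on the form of the distribution.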


Interpretation of r.

If $r^2 = 1$, the values of y associated with assigned values of x
all lie on the line of regression. If r = +1, the correlation is perfect and positive,
while if r = -1, the correlation is perfect and negative. If r is positive,
large values of the one variable tend to be associated with large values of the
other variable. On the other hand, if r is negative, large values of one
variable tend to go with small values of the other variable. If the variables
are functionally independent, r = 0, but the converse is not true.

The existence of correlation does not require that r have any
assigned value such as 0.5. The existence of correlation depends rather on a
comparison of the magnitude of r with its probable error, and on a knowledge of
the existence of linear regression. Thus r = 0.1 with a probable error of
0.0001 means that there is no reasonable doubt as to the existence of correlation.
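
The comparison of r with its probable error may be sketched as follows. The paper does not state a formula for the probable error, so the customary large-sample expression 0.6745 (1 - r²)/√n is assumed here. The function name `probable_error_of_r` and the sample sizes are hypothetical.

```python
import math


def probable_error_of_r(r, n):
    """Probable error of r by the customary formula 0.6745 (1 - r^2) / sqrt(n)."""
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)


for r, n in ((0.10, 100), (0.10, 10000), (0.50, 100)):
    pe = probable_error_of_r(r, n)
    print(f"r = {r:.2f}, n = {n}: probable error = {pe:.4f}, r / p.e. = {r / pe:.1f}")
```

Under this assumed formula a small coefficient from a very large number of cases may be many times its probable error, while the same coefficient from a hundred cases is not clearly distinguishable from zero.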

THE  CORRELATION  RATIO 

If the curve of regression is not linear, r can not be regarded
as a satisfactory measure of the amount of correlation. In this case the
correlation ratio,1 $\eta$, is used. In accurate work it is advisable to compute
$\eta$ as well as r. The quantity $(\eta^2 - r^2)$ affords a measure of the linearity of
the regression.2 The correlation ratio is always numerically greater than the
correlation coefficient, except when the regression is linear, and in this case $\eta = r$.
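
The behavior of $(\eta^2 - r^2)$ may be illustrated on two small invented sets of data. The paper does not write out the formula for the correlation ratio, so the sketch assumes the usual definition for y on x: $\eta^2$ is the weighted sum of squared deviations of the array means from the general mean of y, divided by the total sum of squared deviations of y. The function names are hypothetical.

```python
import math
from collections import defaultdict


def correlation_coefficient(xs, ys):
    """Product-moment correlation coefficient r."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def correlation_ratio_squared(xs, ys):
    """Square of the correlation ratio of y on x (assumed definition)."""
    n = len(ys)
    mean_y = sum(ys) / n
    total = sum((y - mean_y) ** 2 for y in ys)
    arrays = defaultdict(list)            # one x-array of y's for each x value
    for x, y in zip(xs, ys):
        arrays[x].append(y)
    between = sum(
        len(a) * (sum(a) / len(a) - mean_y) ** 2 for a in arrays.values()
    )
    return between / total


# Invented data: array means on a straight line, and array means on a curve.
linear_x, linear_y = [1, 1, 2, 2, 3, 3], [1, 3, 3, 5, 5, 7]
curved_x, curved_y = [-2, -1, 0, 1, 2], [4, 1, 0, 1, 4]

for name, xs, ys in (("linear", linear_x, linear_y), ("curved", curved_x, curved_y)):
    r = correlation_coefficient(xs, ys)
    eta_sq = correlation_ratio_squared(xs, ys)
    print(f"{name}: r^2 = {r * r:.3f}, eta^2 = {eta_sq:.3f}, difference = {eta_sq - r * r:.3f}")
```

In the first set the difference vanishes and r describes the correlation fully. In the second set r is zero although y is completely determined by x, and only the correlation ratio reveals the relation. A second ratio, that of x on y, is obtained by interchanging the two variables.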

SUGGESTED METHOD IN CORRELATION STUDIES ON GRADES

We shall now illustrate the use of these methods as applied to the
correlation between the grades of students. The data are grades obtained from
the Urbana (Illinois) High School. In the following tables are given the
correlation tables for Freshman English with the average of second, third, and
fourth-year English; of Freshman Algebra with the average of second, third, and
fourth-year English; of Freshman English with Plane Geometry; and of Freshman

Draper's  Company  Research  Memoirs. 2.  Biometric  Series  II.      XIV,  pp. 9-11. 
Algebra with Plane Geometry. With these tables the correlation coefficient and
its probable error are computed, the lines of regression are drawn, the two
correlation ratios for each are computed, and the value $(\eta^2 - r^2)$ is determined.
From these values we have all that is essential for the interpretation of the
correlation.

From an examination of $(\eta^2 - r^2)$ in each of the following tables,
it appears that we have linear regression in these cases. The amount of
correlation is therefore well described by the correlation coefficient, and we may
predict from assigned grades the mean value of associated grades by the use
of the correlation coefficient.
TABLE III

SUMMARY

In  summarizing  the  methods  which  educators  have  used,  the  following 
general criticisms should be made. One great difficulty that should be
recognized is that in many of the investigations we accept as measurements a set of
numbers that perhaps serve at best to put individuals in an order or rank rather
than as a measurement of a character. However, assuming that we can accept the
data as measurements of characters, we may characterize the statistical methods
in the following ways. In some instances, the data are merely inspected, and
conclusions are drawn without submitting a tabulation of the data. This method
results in many erroneous conclusions which may long pass for scientific results.
In other cases, the data are represented by histograms or curves, and the
conclusions obtained are based on an inspection of the curves. The use of
figures is nearly always a useful method of picturing correlation, but it can
hardly lead to numerical results. Furthermore, there is the method of computing
correlation coefficients without exhibiting the data in correlation tables or
in tabular form, and without indicating the nature of the lines of regression.
Comments  on  the  use  of  the  latter  method  and  its  defects  occur  frequently  in 
this  paper. 